Strain R6-15 belongs to the genus Thalassolituus, in the family Oceanospirillaceae of the
class Gammaproteobacteria. Representatives of this genus are known as obligate
hydrocarbonoclastic marine bacteria. Thalassolituus oleivorans R6-15 is of special interest
due to its dominance in the crude oil-degrading consortia enriched from the surface seawater
of the Arctic Ocean. Here we describe the complete genome sequence and annotation of this
strain, together with its phenotypic characteristics. The 3,764,053 bp genome comprises one
chromosome without any plasmids and contains 3,372 protein-coding and 61 RNA genes,
including 12 rRNA genes.



Introduction 

Thalassolituus spp. belong to the family Oceanospirillaceae of the class
Gammaproteobacteria. The genus was first described by Yakimov et al. (2004) and
currently comprises two species with validly published names, T. oleivorans and
T. marinus [1,2]. Bacteria of this genus are known as obligate hydrocarbonoclastic
marine bacteria [3]. Previous reports showed that Thalassolituus-related species
were among the most dominant members of petroleum hydrocarbon-enriched consortia
at low temperature [4-7]. In addition to oil-enriched consortia, Thalassolituus spp.
can also be detected in a variety of cold environments [8-10].

Strain R6-15 was isolated from the surface seawater of the Arctic Ocean after
enrichment with crude oil during the fourth Chinese National Arctic Research
Expedition of the icebreaker "Xuelong" in the summer of 2010. Its 16S rRNA gene
sequence shared 99.86% and 96.39% similarity with T. oleivorans MIL-1T and
T. marinus IMCC1826T, respectively. Pyrosequencing results (16S rRNA gene V3
region) of fifteen oil-degrading consortia across the Arctic Ocean showed that the
dominant member in most of the consortia shared a sequence identical to that of
this strain, comprising 8.4-99.6% of the total reads (unpublished data).

Here, we describe the complete genome sequence and annotation of T. oleivorans
R6-15, and its phenotypic characteristics. Moreover, a brief comparison is made
between strain R6-15 and the type strains of the two validly named species of this
genus, in both phenotypic and genomic aspects.
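
The 16S rRNA gene similarity values quoted above are percent identities over aligned sequence positions. The sketch below illustrates one common way such a value can be computed from a pairwise alignment; it is not the procedure used in this study, which is not described in the text. The function name `percent_identity`, the decision to skip gapped columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns.

    Columns with a gap ('-') in either sequence are skipped, which is one of
    several conventions in use (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA gene sequences).
toy_a = "ACGTACGT-ACGTACGTACGA"
toy_b = "ACGTACGTTACGTACCTACGA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}% identity")
```

Different gap-handling conventions can shift the reported value slightly, so similarity figures are only comparable when computed the same way.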

Classification and features 

T. oleivorans R6-15 is closely related to T. oleivorans MIL-1T (Figure 1, Table 1).
The strain is aerobic, Gram-negative and motile by a single polar flagellum, with a
characteristic curved rod-shaped cell morphology (Figure 2). Strain R6-15 is able
to utilize a restricted spectrum of carbon substrates for growth, including sodium
acetate, Tween-40, Tween-80 and C12-C36 aliphatic hydrocarbons. Its growth
temperature ranges from 4 to 32°C, with an optimum of 25°C.

Figure 1. Phylogenetic tree highlighting the position of T. oleivorans strain R6-15 relative to other type and
non-type strains with finished or non-contiguous finished genome sequences within the family Oceanospirillaceae.
Accession numbers of 16S rRNA gene sequences are indicated in brackets. Sequences were aligned using DNAMAN
version 6.0, and a neighbor-joining tree was obtained using the maximum-likelihood method within MEGA version
5.0 [11]. Numbers adjacent to the branches represent percentage bootstrap values based on 1,000 replicates.




Figure 2. Transmission electron micrograph of T. oleivorans R6-15, using a
JEM-1230 (JEOL) at an operating voltage of 120 kV. The scale bar represents
0.5 µm.

Table 1. Classification and general features of T. oleivorans R6-15 according to the MIGS recommendations [12].

MIGS ID    Property                    Term                                                    Evidence code a)
           Current classification      Domain Bacteria                                         TAS [13]
                                       Phylum Proteobacteria                                   TAS [14]
                                       Class Gammaproteobacteria                               TAS [15-17]
                                       Order Oceanospirillales                                 TAS [16,18]
                                       Family Oceanospirillaceae                               TAS [16,19]
                                       Genus Thalassolituus                                    TAS [1]
                                       Species Thalassolituus oleivorans                       IDA
           Gram stain                  Negative                                                IDA
           Cell shape                  Curved rods                                             IDA
           Motility                    Motile                                                  IDA
           Sporulation                 Non-sporulating                                         IDA
           Temperature range           4-32°C                                                  IDA
           Optimum temperature         25°C                                                    IDA
           Carbon source               Sodium acetate, Tween-40, Tween-80, alkanes (C12-C36)   IDA
           Energy source               Chemoorganotrophic                                      IDA
           Terminal electron receptor  Oxygen                                                  IDA
MIGS-6     Habitat                     Surface seawater                                        IDA
MIGS-6.3   Salinity                    [illegible]% NaCl (w/v)                                 IDA
MIGS-22    Oxygen                      Aerobic                                                 IDA
MIGS-15    Biotic relationship         Free-living                                             IDA
MIGS-14    Pathogenicity               Unknown                                                 NAS
MIGS-4     Geographic location         Chukchi Sea, Arctic Ocean                               IDA
MIGS-5     Sample collection time      July 2010                                               IDA
MIGS-4.1   Latitude                    69°30.00'                                               IDA
MIGS-4.2   Longitude                   -168°59.00'                                             IDA
MIGS-4.3   Depth                       Surface seawater                                        IDA
MIGS-4.4   Altitude                    Sea level                                               IDA



a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the litera- 
ture); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a gener- 
ally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project 
[20]. If the evidence code is IDA, then the property should have been directly observed, for the purpose of this specific pub- 
lication, for a live isolate by one of the authors, or an expert or reputable institution mentioned in the acknowledgements. 



When compared to other Thalassolituus species, strain R6-15 differed from the type
strain MIL-1T [1] in catalase, urease and acid phosphatase production, and in the
utilization of n-alkanes, pyruvic acid methyl ester, D-mannitol and D-sorbitol
(Table 2). Differences were also observed with the type strain IMCC1826T [2] in
growth temperature range, in catalase, nitrate reductase, urease and leucine
arylamidase production, and in the utilization of n-alkanes, pyruvic acid methyl
ester, β-hydroxybutyric acid and D,L-lactic acid (Table 2).

Table 2. Differential phenotypic characteristics between T. oleivorans R6-15 and other Thalassolituus species.

Characteristic               1                           2                          3
Cell size (µm)               0.25-0.4 x 1.2-2.0          0.32-0.77 x 1.2-3.1        0.4-0.5 x 1.2-2.5
Salinity/Optimum (w/v)       [illegible]                 0.5-5.7%/2.3%              0.5-5.0%/2.5%
Temperature range (°C)       4-32                        4-30                       15-42
Number of polar flagella     1                           1-4                        1
Production of
  Catalase                   -                           +                          +
  Nitrate reductase          -                           -                          +
  Urease                     w                           -                          +
  Acid phosphatase           +                           -                          +
  Leucine arylamidase        +                           +                          -
Carbon source
  Sodium acetate             +                           +                          na
  n-Alkane                   C12-C36                     C7-C20                     C14 and C16
  Pyruvic acid methyl ester  w                           -                          -
  β-Hydroxybutyric acid      -                           -                          +
  D,L-Lactic acid            -                           -                          +
  D-Mannitol                 -                           +                          -
  D-Sorbitol                 -                           +                          -
Geographic location          Chukchi Sea, Arctic Ocean   Harbor of Milazzo, Italy   Deokjeok island, Korea
Habitat                      surface seawater            seawater/sediment          surface seawater
G+C content (mol%)           46.6                        46.6                       54.6

Strains: 1, T. oleivorans R6-15; 2, T. oleivorans MIL-1T; 3, T. marinus IMCC1826T. +: positive result, -: negative result, w:
weak positive result, na: data not available.



Genome sequencing information 
Genome project history 

This organism was selected for sequencing on the basis of its phylogenetic position
and its dominance in the crude oil-degrading consortia enriched from the surface
seawater of the Arctic Ocean. The complete genome sequence was deposited in GenBank
under accession number CP006829. Sequencing, finishing and annotation of the
T. oleivorans R6-15 genome were performed by the Chinese National Human Genome
Center (Shanghai). Table 3 presents the project information and its association
with MIGS version 2.0 compliance [21].

Table 3. Project information

MIGS ID     Property                   Term
MIGS-31     Finishing quality          Finished
MIGS-28     Libraries used             One 454 pyrosequence standard library
MIGS-29     Sequencing platforms       454 GS FLX Titanium
MIGS-31.2   Fold coverage              21.1×
MIGS-30     Assemblers                 Newbler version 2.7
MIGS-32     Gene calling method        NCBI PGAP pipeline
            GenBank ID                 CP006829
            GenBank Date of Release    On publication
            GOLD ID                    Gi20060
            Project relevance          Crude oil-degradation, biogeography

Growth conditions and DNA isolation

Strain R6-15 was grown aerobically in ONR7a medium [22] with sodium acetate as the
sole carbon and energy source. Genomic DNA was extracted from the cells,
concentrated and purified using the AxyPrep Bacterial Genomic DNA Miniprep Kit
(Axygen), as detailed in the kit manual.

Genome sequencing and assembly 

The genome was sequenced using massively parallel pyrosequencing technology
(454 GS FLX) [23]. A total of 140,550 reads totaling 78,223,504 bases were
obtained, providing 21.1-fold coverage of the genome. The Newbler v2.7 [24]
software package was used for sequence assembly and quality assessment. Assembly
yielded 64 contigs ranging from 500 bp to 304,980 bp, and the relationships among
the contigs were determined by multiplex PCR [25]. Gaps were then filled by
sequencing the PCR products on ABI 3730xl capillary sequencers. A total of 284
additional reactions were necessary to close gaps and to raise the quality of the
finished sequence. Finally, the sequences were assembled using the Phred, Phrap
and Consed software packages [26], and low-quality regions of the genome were
re-sequenced. The final sequence accuracy was approximately 99.999%.
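
The read statistics above can be related to one another with simple arithmetic. The sketch below computes the mean read length, a naive fold coverage, and the Phred-scaled quality that corresponds to the stated accuracy. The function names are hypothetical. Dividing the total bases by the finished chromosome length gives roughly 20.8-fold, slightly below the reported 21.1-fold; the text does not state which genome length was used as the denominator, so the difference may simply reflect a different reference length.

```python
import math

# Values reported in the text.
TOTAL_READS = 140_550
TOTAL_BASES = 78_223_504
FINISHED_GENOME_BP = 3_764_053   # assumption: finished chromosome used as denominator
REPORTED_ACCURACY = 0.99999      # "approximately 99.999%"


def mean_read_length(total_bases: int, total_reads: int) -> float:
    """Average read length in bases."""
    return total_bases / total_reads


def fold_coverage(total_bases: int, genome_bp: int) -> float:
    """Sequenced bases divided by genome length."""
    return total_bases / genome_bp


def phred_quality(accuracy: float) -> float:
    """Phred-scaled quality, Q = -10 * log10(error rate)."""
    return -10.0 * math.log10(1.0 - accuracy)


read_len = mean_read_length(TOTAL_BASES, TOTAL_READS)
coverage = fold_coverage(TOTAL_BASES, FINISHED_GENOME_BP)
quality = phred_quality(REPORTED_ACCURACY)

print(f"mean read length: {read_len:.1f} bases")
print(f"naive fold coverage: {coverage:.1f}x")
print(f"Phred-scaled quality: Q{quality:.0f}")
```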



Genome annotation 

The protein-coding genes, structural RNAs (5S, 16S, 23S), tRNAs and small
non-coding RNAs were predicted using the online NCBI Prokaryotic Genome Annotation
Pipeline (PGAP) server [27]. Functional annotation of the predicted ORFs was
performed using RPS-BLAST [28] against the Clusters of Orthologous Groups (COG)
database [29] and the Pfam database [30]. The TMHMM program was used to predict
genes encoding proteins with transmembrane helices [31], and the SignalP program
was used to predict genes encoding proteins with signal peptides [32].

Genome properties 

The properties and statistics of the genome are summarized in Table 4. The genome
consists of one circular chromosome of 3,764,053 bp (46.6% G+C content). In total,
3,489 genes were predicted, of which 3,372 are protein-coding genes, 61 are RNA
genes and 56 are pseudogenes. A putative function was assigned to 2,340 genes
(67.07% of all predicted genes), while the remaining protein-coding genes were
annotated as hypothetical proteins. The distribution of genes into COG functional
categories is presented in Table 5 and Figure 3.



Table 4. Genome statistics

Attribute                           Value        % of Total a)
Genome size (bp)                    3,764,053    100.0
DNA coding region (bp)              3,315,444    88.08
DNA G+C content (bp)                1,753,947    46.60
Number of replicons                 1
Extrachromosomal elements           0
Total genes                         3,489        100.00
RNA genes                           61           1.75
tRNA genes                          48           1.38
rRNA operons                        4
ncRNA genes                         1            0.03
Protein-coding genes                3,372        96.65
Pseudo genes                        56           1.61
Genes with function prediction      2,340        67.07
Genes in paralog clusters           1,051        30.12
Genes assigned to COGs              2,249        64.46
Genes assigned Pfam domains         2,576        73.83
Genes with signal peptides          338          9.69
Genes with transmembrane helices    775          22.21

a) The total is based on either the size of the genome in base pairs or on the total number of
genes in the annotated genome.
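
The percentages in Table 4 can be reproduced from the raw counts. The sketch below recomputes a few of them. The gene-based percentages match when the total number of predicted genes (3,489) is used as the denominator, and the base-pair rows match when the genome size is used. The dictionary layout and the helper name `percent` are illustrative choices. The rRNA gene count is derived on the assumption that RNA genes consist only of tRNA, rRNA and ncRNA genes, which agrees with four operons each carrying one 5S, 16S and 23S gene.

```python
TOTAL_GENES = 3_489
GENOME_SIZE_BP = 3_764_053

# Gene counts from Table 4.
gene_counts = {
    "RNA genes": 61,
    "tRNA genes": 48,
    "ncRNA genes": 1,
    "Protein-coding genes": 3_372,
    "Pseudo genes": 56,
    "Genes with function prediction": 2_340,
    "Genes assigned to COGs": 2_249,
}

# Base-pair counts from Table 4.
base_counts = {
    "DNA coding region": 3_315_444,
    "DNA G+C content": 1_753_947,
}


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


gene_percentages = {name: percent(n, TOTAL_GENES) for name, n in gene_counts.items()}
base_percentages = {name: percent(n, GENOME_SIZE_BP) for name, n in base_counts.items()}

# rRNA genes implied by the other RNA counts (assumption stated in the lead-in).
rrna_genes = gene_counts["RNA genes"] - gene_counts["tRNA genes"] - gene_counts["ncRNA genes"]

for name, value in {**base_percentages, **gene_percentages}.items():
    print(f"{name}: {value:.2f}%")
print(f"rRNA genes: {rrna_genes}")
```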

Figure 3. Graphical map of the chromosome. From outside to the center: Genes on
forward strand (color by COG categories), genes on reverse strand (color by COG
categories), RNA genes (tRNAs green, rRNAs red), GC content, GC skew.



Table 5. Number of genes associated with the 25 general COG functional categories

Code   Value   %age    Description
J      182     7.11    Translation, ribosomal structure and biogenesis
A      1       0.04    RNA processing and modification
K      161     6.29    Transcription
L      132     5.16    Replication, recombination and repair
B      1       0.04    Chromatin structure and dynamics
D      32      1.25    Cell cycle control, cell division, chromosome partitioning
Y      0       0.00    Nuclear structure
V      28      1.09    Defense mechanisms
T      152     5.94    Signal transduction mechanisms
M      150     5.86    Cell wall/membrane/envelope biogenesis
N      85      3.32    Cell motility
Z      1       0.04    Cytoskeleton
W      0       0.00    Extracellular structures
U      83      3.24    Intracellular trafficking, secretion, and vesicular transport
O      127     4.96    Posttranslational modification, protein turnover, chaperones
C      143     5.59    Energy production and conversion
G      76      2.97    Carbohydrate transport and metabolism
E      187     7.30    Amino acid transport and metabolism
F      67      2.62    Nucleotide transport and metabolism
H      115     4.49    Coenzyme transport and metabolism
I      106     4.14    Lipid transport and metabolism
P      138     5.39    Inorganic ion transport and metabolism
Q      57      2.23    Secondary metabolites biosynthesis, transport and catabolism
R      329     12.85   General function prediction only
S      207     8.09    Function unknown
-      1240    35.54   Not in COGs
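
The percentages in Table 5 use two different denominators, which is easy to miss. The sketch below, using only the counts from the table, suggests that the per-category percentages are relative to the sum of all category assignments (2,560, which exceeds the 2,249 genes assigned to COGs, presumably because a gene can fall into more than one category), while the "Not in COGs" row is relative to the 3,489 total genes. This reading is inferred from the arithmetic and is not stated in the text.

```python
# COG category counts from Table 5.
cog_counts = {
    "J": 182, "A": 1, "K": 161, "L": 132, "B": 1, "D": 32, "Y": 0,
    "V": 28, "T": 152, "M": 150, "N": 85, "Z": 1, "W": 0, "U": 83,
    "O": 127, "C": 143, "G": 76, "E": 187, "F": 67, "H": 115,
    "I": 106, "P": 138, "Q": 57, "R": 329, "S": 207,
}
NOT_IN_COGS = 1_240
TOTAL_GENES = 3_489

# Denominator for the per-category percentages: all category assignments.
total_assignments = sum(cog_counts.values())

cog_percentages = {
    code: round(100.0 * n / total_assignments, 2) for code, n in cog_counts.items()
}
# The "Not in COGs" row is instead a share of all predicted genes.
not_in_cogs_pct = round(100.0 * NOT_IN_COGS / TOTAL_GENES, 2)

print(f"total category assignments: {total_assignments}")
for code, pct in cog_percentages.items():
    print(f"{code}: {pct:.2f}%")
print(f"Not in COGs: {not_in_cogs_pct:.2f}%")
```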

Insights from the genome sequence

Until now, the only genome sequence available within the genus Thalassolituus was
that of the type strain T. oleivorans MIL-1T [9]. Here, we compared the genome of
strain R6-15 with that of strain MIL-1T (Table 6). The genome of strain R6-15 is
about 156 kb smaller than that of strain MIL-1T. The G+C content of strain R6-15
(46.6%) is the same as that of the type strain MIL-1T (46.6%). Strain R6-15 also
has fewer genes than strain MIL-1T (3,489 vs 3,732).

Strain R6-15 shares 2,995 orthologous genes with the type strain MIL-1T. The
average nucleotide sequence identity between strain R6-15 and MIL-1T is 96.92%.
In addition, the DNA-DNA hybridization (DDH) estimate between strain R6-15 and
MIL-1T was calculated using the Genome-to-Genome Distance Calculator (GGDC 2.0)
[33,34]. The DDH estimate was 84.5% ± 2.57, which is above the standard
species-delineation criterion (70%) [35]. Therefore, these results confirm that
strain R6-15 belongs to the species Thalassolituus oleivorans.









Table 6. Comparison of genomes between T. oleivorans R6-15 and T. oleivorans MIL-1T

Genome Name            Genome size (bp)   Gene count   Protein coding   Protein with function   Protein without function   Plasmid number   rRNA operons
T. oleivorans R6-15    3,764,053          3,489        3,372            2,340                   1,032                      0                4
T. oleivorans MIL-1T   3,920,328          3,732        3,603            2,038                   1,565                      0                4
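
The comparison between the two genomes reduces to a few differences and one threshold check. The sketch below uses the Table 6 values and the DDH estimate quoted in the text. The `GenomeSummary` class and the interval test are illustrative assumptions: treating the ± value as a symmetric half-width is a simplification, not GGDC's own model-based confidence interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeSummary:
    """Subset of the Table 6 columns (hypothetical container for this sketch)."""
    name: str
    size_bp: int
    genes: int
    protein_coding: int
    with_function: int
    without_function: int


r6_15 = GenomeSummary("T. oleivorans R6-15", 3_764_053, 3_489, 3_372, 2_340, 1_032)
mil_1 = GenomeSummary("T. oleivorans MIL-1T", 3_920_328, 3_732, 3_603, 2_038, 1_565)

# Differences quoted in the text.
size_difference_bp = mil_1.size_bp - r6_15.size_bp
gene_difference = mil_1.genes - r6_15.genes

# Species-level check against the 70% DDH criterion cited in the text.
DDH_SPECIES_THRESHOLD = 70.0
ddh_estimate = 84.5
ddh_half_width = 2.57  # assumption: read "± 2.57" as a symmetric half-width
same_species = (ddh_estimate - ddh_half_width) > DDH_SPECIES_THRESHOLD

print(f"genome size difference: {size_difference_bp:,} bp")
print(f"gene count difference: {gene_difference}")
print(f"DDH lower bound above {DDH_SPECIES_THRESHOLD}%: {same_species}")
```

Even the lower end of the interval (about 81.9%) lies well above the 70% criterion, so the species assignment does not depend on how the uncertainty is interpreted.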



Conclusion 

Strain R6-15 is the first strain of the genus Thalassolituus isolated from the
Arctic Ocean to have a complete genome sequence. These genomic data will provide
insights into how this bacterium thrives on crude oil in polar marine
environments.